Add ResearchClawBench eval framework by black-yt · Pull Request #2174 · huggingface/huggingface.js

black-yt · 2026-05-15T05:28:58Z

Summary

Adds researchclawbench to the supported evaluation frameworks for benchmark dataset eval.yaml files.

ResearchClawBench is an end-to-end scientific research benchmark for AI agents and standalone LLMs, covering workflows from reading raw data and related work to producing code, figures, and publication-style reports.

Dataset prepared for the Hub Evaluation Results feature:
https://huggingface.co/datasets/InternScience/ResearchClawBench

The dataset repo already includes:

- eval.yaml with evaluation_framework: researchclawbench
- .eval_results/*.yaml entries following the benchmark result format
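
To illustrate why the framework id has to be registered before the dataset's eval.yaml is accepted, here is a minimal sketch of a membership check. The Hub's actual recognition logic is not part of this PR; `SUPPORTED_FRAMEWORKS`, `ParsedEvalConfig`, `isSupportedFramework`, and `declaredFramework` are hypothetical names, and only the `evaluation_framework` key and the `researchclawbench` value come from the description above.

```typescript
// Hypothetical list of supported ids; only "researchclawbench" comes from this PR.
const SUPPORTED_FRAMEWORKS = ["researchclawbench"] as const;
type SupportedFramework = (typeof SUPPORTED_FRAMEWORKS)[number];

// Assumed shape of an already-parsed eval.yaml. Only the evaluation_framework
// key is taken from the PR text; any other fields are unknown here.
interface ParsedEvalConfig {
	evaluation_framework?: unknown;
}

// Type guard: true only for strings that appear in the supported list.
function isSupportedFramework(value: unknown): value is SupportedFramework {
	return typeof value === "string" && (SUPPORTED_FRAMEWORKS as readonly string[]).indexOf(value) !== -1;
}

// Returns the declared framework id if it is supported, otherwise undefined.
function declaredFramework(config: ParsedEvalConfig): SupportedFramework | undefined {
	const id = config.evaluation_framework;
	return isSupportedFramework(id) ? id : undefined;
}
```

Under these assumptions, a config declaring an id that is missing from the list would simply not be recognized, which is the situation this PR is meant to resolve for researchclawbench.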

For reference, a similar benchmark setup:
https://huggingface.co/datasets/claw-eval/Claw-Eval

Change

Add researchclawbench to EVALUATION_FRAMEWORKS in packages/tasks/src/eval.ts.
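
As a rough sketch of what such a registry entry could look like: the diff itself is not shown on this page, so the entry shape below (`name`, `description`, `url`) and the `EvaluationFrameworkId` type are assumptions for illustration, not the contents of packages/tasks/src/eval.ts. The description text and URL are taken from the summary above.

```typescript
// Hypothetical, simplified stand-in for the registry in packages/tasks/src/eval.ts.
// The field names (name, description, url) are assumed, not confirmed by this PR page.
const EVALUATION_FRAMEWORKS = {
	// ...existing entries elided...
	researchclawbench: {
		name: "researchclawbench",
		description:
			"ResearchClawBench is an end-to-end scientific research benchmark for AI agents and standalone LLMs.",
		url: "https://huggingface.co/datasets/InternScience/ResearchClawBench",
	},
} as const;

// Union of registered ids, derived from the registry keys (hypothetical type name).
type EvaluationFrameworkId = keyof typeof EVALUATION_FRAMEWORKS;

// Example: this assignment only type-checks because the key exists in the registry.
const frameworkId: EvaluationFrameworkId = "researchclawbench";
```

Deriving the id type from the registry keys means that adding one entry is enough for the new id to be accepted wherever that type is used, which would be consistent with the single-file change described here.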

Notes

This is intended to allow the ResearchClawBench dataset to be recognized as a Benchmark dataset and display the benchmark leaderboard/tag on the Hub.

Note

Low Risk
Adds a new entry to a static EVALUATION_FRAMEWORKS registry with no changes to execution flow or data handling.

Overview
Adds ResearchClawBench to the EVALUATION_FRAMEWORKS map in packages/tasks/src/eval.ts, enabling benchmark datasets to declare evaluation_framework: researchclawbench and be recognized accordingly.

Reviewed by Cursor Bugbot for commit 36c6e24. Bugbot is set up for automated code reviews on this repo.

black-yt · 2026-05-19T01:04:19Z

@SBrandeis @Wauplin @gary149 @julien-c @ngxson @pcuenca

Just following up on this PR in case it was missed.

This change only adds researchclawbench to the supported evaluation frameworks so that the dataset can be recognized as a Benchmark dataset on the Hub and display the benchmark leaderboard/tag correctly.

The dataset repo and evaluation result files are already prepared:

https://huggingface.co/datasets/InternScience/ResearchClawBench

Please let me know if there are any additional requirements or adjustments needed from my side. Thanks!

krampstudio

seen with @NathanHB

Add ResearchClawBench eval framework

636b49e

black-yt requested review from SBrandeis, Wauplin, gary149, julien-c, ngxson and pcuenca as code owners May 15, 2026 05:29

Merge branch 'main' into add-researchclawbench-eval-framework

fcb0f02

Merge branch 'main' into add-researchclawbench-eval-framework

36c6e24

krampstudio approved these changes May 27, 2026


krampstudio merged commit 7ef4d94 into huggingface:main May 27, 2026
6 checks passed

krampstudio merged 3 commits into huggingface:main from black-yt:add-researchclawbench-eval-framework